Skip to content

StatisticalPlots macro#1374

Open
pstaabp wants to merge 5 commits intoopenwebwork:developfrom
pstaabp:statistical-plots
Open

StatisticalPlots macro#1374
pstaabp wants to merge 5 commits intoopenwebwork:developfrom
pstaabp:statistical-plots

Conversation

@pstaabp
Copy link
Member

@pstaabp pstaabp commented Jan 30, 2026

This creates a StatisticalPlots macro, which contains the methods add_barplot, add_histogram, add_boxplot and add_scatterplot.

There are many options for each and there is documentation as well.

This also includes an add_rectangle method to the plots.pl macro which is a wrapper for the add_dataset for creating rectangles.

This is all in a single perl file, but can split out into modules like the plots macro.

I'm making this a draft and there are some additional options I'd like to add. I'm also open to renaming items and adding other plots before this goes in.

Here's a few test problems to show how to use it:

stats.zip

Note: updated zip file.

@somiaj
Copy link
Contributor

somiaj commented Jan 31, 2026

I am a little confused, is the lib/Plots/StatPlot.pm an initial attempt, before making this fully a macro? Seems the same object and methods (which are more robust) are all included in macros/graph/StatisticalPlot.pl, which is probably a better place, and the file in lib/Plots should be removed?

Copy link
Member

@drgrice1 drgrice1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't delved to deeply into this yet, but here are some initial observations.

Lines 100 and 105 are too long. Shorten them. Ideally the POD would be kept even shorter than the usual 120 character limit. I think a max of 100 characters is a better limit for that, maybe even the classical 80 character limit would be better. The POD is all textual and if you are trying to read it directly in the code it makes it easier to read if it is not so long. Generally, you have kept the lines to 100 characters or less, but there are a few others that are inconsistently longer.

I think you should eliminate the StatPlot.pm module entirely. There is not much code there, and so just put that directly into the StatisticalPlots.pl macro file. The macro file is not that large anyway (at least at this point compared to many others). The plots.pl macro is bigger, and it only has 5 lines of actual code!

next;
}
if (ref $image_item eq 'Plots::Plot') {
if (ref $image_item eq 'Plots::Plot' || ref $image_item eq 'Plots::StatPlot') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worthwhile to make this more general, for future extension macros. Maybe call your package Plots::Plot::StatPlot, and then have this only check if ref $image_item starts with Plots::Plot, that way this won't have to be updated each time someone wants to extend Plots?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think there are some other places that might need to be updated, I recall having to modify a few places to check for the ref being equal to Plots::Plot beyond just this place (for some older image macros).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this would be an isa call. Then any object that derives from a Plots::Plot object would work here. Unfortunately, to properly do that you need the Scalar::Util::blessed method which is not available here. The correct way to check if an object, say $object, derives from a particular class is if (blessed($object) && $object->isa('Parent::Package')).

When I was creating the SimpleGraph.pl macro, I almost made that package derive from the Plots::Plot package, and then wanted to make this code that way, but ran into the issue with the lack of the blessed method availability. I ended up going a different direction with the SimpleGraph.pl macro though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, line 2946 below needs to check the ref also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DO NOT add this ref check to either macros/core/VectorField2D.pl or macros/graph/unionImage.pl. There are checks for the Plots::Plot object in those macros.

The VectorField2D.pl macro usage doesn't make sense for this statistical graph macro.

As to the unionImage.pl one, I told @somiaj not to add that there. It should be removed. No one should be using that macro anymore, and it should be moved into the deprecated folder.

use strict;
use warnings;

use WeBWorK::Utils qw(min max);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed this. This definitely can't be. A PG package can not use a WeBWorK package or the methods from it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those should be in the PGstatisticsmacros.pl macro--I think there are functions in there that use a variation of that--I have some changes to that macro. Perhaps I'll move min/max into there as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, this is in PGauxillaryFunctions.pl, and I use these.

@drgrice1
Copy link
Member

OSX users need to learn to not include the __MACOSX directory in the archive files that they distribute!

Copy link
Member

@drgrice1 drgrice1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now that this pull request doesn't even use the lib/Plots/StatPlot.pm file. So just delete it, and ignore my comments in that file.

@drgrice1
Copy link
Member

Here are some issues seen with your example problems:

The histogram.pg example in your zip file is not working. It gives the warning Use of uninitialized value $options{"orientation"} in string eq at line 358 of [PG]/macros/graph/StatisticalPlots.pl. and the resulting graph looks quite bad.

The boxplot_vertical.pl problem does not show a vertical box plot. In fact the problem is identical to the boxplot.pl file. Even adding the orientation => 'vertical' option to the add_boxplot call does not make it so.

The scatter01.pg file is the new problem template, and not a scatter plot at all.

@pstaabp
Copy link
Member Author

pstaabp commented Jan 31, 2026

About the StatPlot.pm file. I didn't mean to commit that. Sorry I notice that it made it in.

Where's the __MACOS directory? MacOS has this .DSStore file, but that's not in here either.

About scatter01.pg, I forgot to save changes. Here's an update of those problems--I just added one--with some new functionality.

stats.zip

@pstaabp
Copy link
Member Author

pstaabp commented Feb 2, 2026

Adds some other functionality to StatisticalPlots.pl

  • Ability to change the outlier marks in a boxplot
  • Adds a whisker cap option to a boxplot.
  • Adds a cap_width option for these whiskers.
  • Adds custom tick labels for the x-axis. For example, replacing the ticks 3, 6, 9, with 'Red', 'Orange', 'Blue'.

About the custom tick labels. It seems to work, but may not be the best way to do this. This only works on the x-axis, but will adapt whatever we go with to the y-axis as well. Also, I haven't made this work with the tikz output.. Will do this before this is out of draft.

Here's some updated problems.

stats.zip

This contains the methods add_barplot, add_histogram, add_boxplot and
add_scatterplot.

There are many options for each and there is documentation as well.

This also includes the add_rectangle method to the plot.pl macro
which is a wrapper for the add_dataset for creating rectangles.
Cleanup of the POD.
Removal of Plots::StatsPlot->new function.  It wasn't needed.
Make sure Plots::StatsPlot objects are rendered in PGbasicmacros.pl.
- Ability to change the outlier marks in a boxplot
- Adds a whisker cap option to a boxplot.
- Adds a cap_width option for these whiskers.
- Adds custom tick labels for the x-axis.
Add beginnings of some color palettes.
Fix a typo in Data.pm
@pstaabp
Copy link
Member Author

pstaabp commented Feb 6, 2026

  • Added pie charts and donut/ring charts.
  • Added aliases mark_color, stroke_color and stroke_width to the stat plot to help distinguish other colors/widths.
  • Added the beginnings of a color_palette function to return a default color palette. Note there is only one there right now, but will add more .

Here is an update of the problems I've been testing with:

stats.zip

Added additional palettes.
Improvement for documentation.
@pstaabp
Copy link
Member Author

pstaabp commented Feb 24, 2026

Another update to include custom tick labels, some additional palettes and some documentation updates.

@pstaabp pstaabp marked this pull request as ready for review February 24, 2026 20:22
drgrice1 added a commit to drgrice1/pg that referenced this pull request Mar 25, 2026
…ots.pl` macro.

To set custom tick positions use the `tick_positions` axis option.  Set
that to a reference to an array containing the positions on the axis
that ticks are desired.  For example, `tick_positions => [ 2, 5, 9 ]`
will place ticks at positions 2, 5, and 9 on the axis.  Note that when
this option is used the `tick_delta`, `tick_scale`, and `tick_distance`
options are not used. So only the given tick positions will appear in
the graph.

To set custom tick labels use the `tick_labels` option.  Note that this
is not a new option, but now it accepts a new type of value.  Previously
this was purely boolean (0 or 1), and it only determined if tick labels
would be shown or not.  Now it can take a value that is a reference to a
hash.  The keys of the hash are tick positions, and the values are the
labels to be placed at those positons. Note that formatting of the label
must be done by the auther, and the `tick_label_format` option is
ignored for any label provided in this hash. If a major tick is not
listed in the hash, then the position will be used for the label and it
will be formatted according to the `tick_label_format` option.

This is intended to replace what is done in openwebwork#1374 and is a more flexible
approach than what is done there.  In that pull request the capability
for custom tick labels only is added, and it is extremely restrictive in
what it can do.  Only positive tick labels can be customized, and it
requires that the problem author label all major ticks (there is no
fallback and a tick is labeled "undefined" if one is missing).
Copy link
Member

@drgrice1 drgrice1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the changes for custom labels in the plots.pl macro and associated modules from this pull request in favor of #1389. The custom labels are not actually used by anything here, so it shouldn't be a big deal to do so.

The only place you do use it is in an example problem that you posted. Please test pull request #1389. That should more than meet the needs of what you do in that problem. Furthermore, the histogram code in the StatisticalPlots.pl macro could use it to ensure that the class ends are labeled. That is what I do in my problems that directly use the plots macro to construct histograms.

Copy link
Member

@drgrice1 drgrice1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a review of the add_rectangle method. This completes what I have seen for the changes of the plots.pl code. The typo fixes are really the only things that should stay as they are.

I will try to look at the StatisticalPlots.pl macro deeper soon.

Comment on lines +395 to +406
sub add_rectangle {
my ($self, $pt0, $pt2, %options) = @_;

Value::Error('The first point must be an array ref of length 2')
unless ref($pt0) eq 'ARRAY' && scalar(@$pt0) == 2;
Value::Error('The second point must be an array ref of length 2')
unless ref($pt2) eq 'ARRAY' && scalar(@$pt2) == 2;
# If the fill_color option is set, set the fill to 'self'.
$options{fill} = 'self' if $options{fill_color} && !defined($options{fill});
return $self->add_dataset($pt0, [ $pt2->[0], $pt0->[1] ], $pt2, [ $pt0->[0], $pt2->[1] ], $pt0, %options);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rather inconsistent with the rest of the Plot.pm code.

Most of the plots methods allow adding one or multiple of each object at a time. The add_rectangle method should do this as well.

No where in the plots code is the Value::Error method used to give messages about incorrect usage, and for good reason. Calling Value::Error dies and prevents the entire problem from rendering. That should really only be done in answer checkers where those exceptions are caught, and not for something like this where you really just want to notify the problem author of incorrect usage. A warn is certainly sufficient for that, and then the problem can still render. There is also no need for a separate message for the two points. Just give one.

In short change this to

Suggested change
sub add_rectangle {
my ($self, $pt0, $pt2, %options) = @_;
Value::Error('The first point must be an array ref of length 2')
unless ref($pt0) eq 'ARRAY' && scalar(@$pt0) == 2;
Value::Error('The second point must be an array ref of length 2')
unless ref($pt2) eq 'ARRAY' && scalar(@$pt2) == 2;
# If the fill_color option is set, set the fill to 'self'.
$options{fill} = 'self' if $options{fill_color} && !defined($options{fill});
return $self->add_dataset($pt0, [ $pt2->[0], $pt0->[1] ], $pt2, [ $pt0->[0], $pt2->[1] ], $pt0, %options);
}
sub _add_rectangle {
my ($self, $pt0, $pt2, %options) = @_;
unless (ref($pt0) eq 'ARRAY' && @$pt0 == 2 && ref($pt2) eq 'ARRAY' && @$pt2 == 2) {
warn 'A rectangle requires two points defined by length two array references.';
return;
}
$options{fill} = 'self' if $options{fill_color} && !defined $options{fill};
return $self->_add_dataset($pt0, [ $pt2->[0], $pt0->[1] ], $pt2, [ $pt0->[0], $pt2->[1] ], $pt0, %options);
}
sub add_rectangle {
my ($self, @data) = @_;
if (ref($data[0]) eq 'ARRAY' && ref($data[0][0]) eq 'ARRAY') {
return [ map { $self->_add_rectangle(@$_) } @data ];
}
return $self->_add_rectangle(@data);
}

Comment on lines +301 to +317
=head2 PLOT RECTANGLES

A rectangle can be plotted with the C<< $plot->add_rectangle >> method. This is a
convenience method as a shortcut for the C<< $plot->add_dataset >> method. The first
two arguments are opposite corners of the rectangle. All other arguments are passed
as options to the C<< add_dataset >> method.

The following makes a filled rectangle with a thicker blue border.

$plot->add_rectangle([2,1], [6,3],
color => 'blue',
width => 1.5,
fill => 'self',
fill_color => 'yellow',
fill_opacity => 0.1,
);

Copy link
Member

@drgrice1 drgrice1 Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with the other POD in this file. This should be much like the POD for the add_circle and add_arc methods. Those methods are really just dataset methods also, and so there is no need to remark on this being a convenience method that just calls add_dataset.

So change this to something like

Suggested change
=head2 PLOT RECTANGLES
A rectangle can be plotted with the C<< $plot->add_rectangle >> method. This is a
convenience method as a shortcut for the C<< $plot->add_dataset >> method. The first
two arguments are opposite corners of the rectangle. All other arguments are passed
as options to the C<< add_dataset >> method.
The following makes a filled rectangle with a thicker blue border.
$plot->add_rectangle([2,1], [6,3],
color => 'blue',
width => 1.5,
fill => 'self',
fill_color => 'yellow',
fill_opacity => 0.1,
);
=head2 PLOT RECTANGLES
A rectangle can be plotted with the C<< $plot->add_rectangle >> method. This
method takes two points which are opposite corners of the rectangle. Multiple
rectangles can be plotted at once by passing references to arrays of data for
each rectangle.
$plot->add_rectangle([$lower_left_x, $lower_left_y], [$upper_right_x, $upper_right_y], %options);
$plot->add_rectangle(
[[$lower_left_x1, $lower_left_y1], [$upper_right_x1, $upper_right_y1], %options1],
[[$lower_left_x2, $lower_left_y2], [$upper_right_x2, $upper_right_y2], %options2],
...
);

That also incorporates in adding multiple rectangles for the suggested code in my other comment.

Edit: Oops. I cut and pasted the wrong section of POD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants